3. Identifying and Resolving Exchange Server Issues
The event logs are one of the most valuable resources available to administrators for identifying issues. Windows Server 2008 introduces some changes to the familiar Event Viewer tool. The legacy Windows Logs category still exists, and there is a new category named Applications and Services Logs. The Windows Logs category adds two new logs: the Setup log and the ForwardedEvents log. Logs in this category are intended to contain events that apply across the entire system.
The Applications and Services logs store events from a single application or sub-component. This new type of channel is code-named the application's crimson channel and is part of the new event logging API. The Applications and Services logs have four sub-types, shown in Table 5.
Table 5. Application and Service Logs Channels
TYPE | DESCRIPTION |
---|
Admin | Events that are primarily targeted at administrators and support personnel. Events in the Admin log should provide clear remediation steps that an administrator can perform. |
Operational | Operational events are used for analyzing and diagnosing a problem. Events in the Operational log may require more interpretation than events in the Admin log. |
Analytic | The events in the Analytic log are not meant to be handled by user intervention. This log is mainly used for tracing information and can generate a high volume of data. By default the analytic logs are hidden and disabled. |
Debug | The Debug log is used by developers troubleshooting application issues. |
Exchange 2010 uses the Applications and Services logs channels for HighAvailability and MailboxDatabaseFailureItems. To locate these logs on a Mailbox server, perform the following steps:
Open the Event Viewer MMC.
In the console tree, select and expand Applications And Services Logs, then Microsoft, and then Exchange.
Select the HighAvailability or MailboxDatabaseFailureItems channel.
If you follow these steps your Event Viewer MMC will resemble Figure 7.
The HighAvailability channel contains events from the Microsoft Exchange Replication service. Active Manager logs events related to mounting, dismounting, reseeding, and other database operations. The Volume Shadow Copy Service (VSS), Cluster service, and TCP listener also log events here.
The MailboxDatabaseFailureItems channel logs events associated with any failures that affect a replicated mailbox database.
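The same two channels can also be read from PowerShell, which is convenient when checking several Mailbox servers. The following is a minimal sketch, assuming the channels are exposed under the log names Microsoft-Exchange-HighAvailability/Operational and Microsoft-Exchange-MailboxDatabaseFailureItems/Operational; confirm the exact names on your server with Get-WinEvent -ListLog before relying on them.

```powershell
# Confirm the Exchange channel names available on this server.
Get-WinEvent -ListLog 'Microsoft-Exchange-*' | Select-Object LogName, RecordCount

# Show the 20 most recent HighAvailability events (log name is an assumption).
$haChannel = 'Microsoft-Exchange-HighAvailability/Operational'
Get-WinEvent -LogName $haChannel -MaxEvents 20 |
    Select-Object TimeCreated, Id, LevelDisplayName, Message |
    Format-Table -AutoSize -Wrap

# Show only error-level (Level 2) failure items from the last 24 hours.
# Get-WinEvent raises an error when nothing matches, so suppress it.
$filter = @{
    LogName   = 'Microsoft-Exchange-MailboxDatabaseFailureItems/Operational'
    Level     = 2
    StartTime = (Get-Date).AddDays(-1)
}
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue
```

Add the -ComputerName parameter to query a remote Mailbox server instead of the local one.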
3.1. DAGs and Mailbox Copies
Exchange 2010 ships with a number of scripts that help collect and report on database metrics. These in-box tools are an easy way to show whether the Exchange system is meeting the defined service levels. Additionally, these reports can help an administrator tune the environment if it is not meeting its SLAs. The scripts are located in the [Exchange install path]\scripts directory. They must be run from the Exchange Management Shell (EMS).
The first script is named CollectOverMetrics.ps1. It collects and reports on failover and switchover statistics. Microsoft refers to these database moves as *overs, a generic term for any time the database moves between hosts. A number of parameters are available for customizing the script, but the most useful is the ability to output an HTML report. The following command will collect all of the metrics and generate a report:
./CollectOverMetrics.ps1 -GenerateHTMLReport
The script was rewritten for Service Pack 1 and now produces considerably more output: more than 40 metrics. Table 6 lists some of the information that is returned in the HTML report.
Table 6. Statistics from the CollectOverMetrics Script
PROPERTY | DESCRIPTION |
---|
DatabaseName | The name of the database |
TimeRecoveryStarted | The start time and date of the *over |
ActionType | The cause of the *over (move, mount, or dismount) |
ActionTrigger | Actions may be initiated automatically or by an administrator |
ActionReason | Why the *over occurred |
Under 30s, Over 30s | The number of operations taking less than or more than 30 seconds, respectively |
DurationOutage | The total time service was unavailable |
DurationDismount, DurationAcll, DurationMount | The amount of time spent at each stage of the operation |
NumberOfAttempts | How many times the *over was attempted |
LostLogs | The number of logs lost during the *over operation |
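A narrower CollectOverMetrics.ps1 run is often more practical than collecting everything. The following is a minimal sketch, assuming a DAG named DAG1, databases whose names start with DB, and a report folder of C:\Reports (all three are placeholders). The parameter names are those documented for the SP1 version of the script; confirm them with Get-Help .\CollectOverMetrics.ps1 before use.

```powershell
# Run from the Exchange Management Shell, in the Exchange scripts directory.
# DAG1, "DB*", and C:\Reports are placeholder values for illustration only.
.\CollectOverMetrics.ps1 -DatabaseAvailabilityGroup DAG1 `
    -Database "DB*" `
    -StartTime (Get-Date).AddDays(-7) `
    -EndTime (Get-Date) `
    -ReportPath "C:\Reports" `
    -GenerateHtmlReport -ShowHtmlReport
```

Limiting the time window and database set keeps the report focused on the *overs relevant to a specific incident or SLA review period.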
The second script is CollectReplicationMetrics.ps1. This is useful for troubleshooting because it collects metrics in real time. The output statistics are shown in Table 7.
Table 7. Statistics from the CollectReplicationMetrics Script
PROPERTY | DESCRIPTION |
---|
DATABASE REPORT |
|
DatabaseName | The name of the database |
ServerName | The name of the server hosting a copy of the database |
HoursMounted | The length of time the active database copy has been mounted on a given host |
MinutesUnavailable |
|
MinutesResynchronizing | The length of time the mailbox database copy and its log files are being compared with the active copy of the database to check for any divergence between the two copies |
MinutesFailed | The length of time the mailbox database copy is in a Failed state because it isn't suspended and it isn't able to copy or replay log files |
MinutesSuspended | The length of time the mailbox database copy is in a Suspended state as a result of an administrator manually suspending the database copy |
MinutesFailedSuspended | The length of time the Failed and Suspended states have been set simultaneously by the system because a failure was detected and because resolution of the failure explicitly requires administrator intervention |
MinutesDisconnected | The length of time the mailbox database copy has been disconnected from the active database copy |
AverageLogGenerationRate | The rate at which new logs are being generated |
SERVER REPORT |
|
HoursMeasured | The length of time the script collected performance data |
HoursUnavailable | The length of time the server was unreachable |
AverageMountedMinutes | The average length of time the server |
AverageLogReplayRate | The average rate for log replay |
PeakLogReplayRate | The peak rate for log replay |
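Because CollectReplicationMetrics.ps1 samples performance counters while it runs, you specify how long to collect and how often to sample. The following is a minimal sketch, assuming a DAG named DAG1 and a report folder of C:\Reports (both placeholders); verify the parameter names with Get-Help .\CollectReplicationMetrics.ps1 in your environment.

```powershell
# Run from the Exchange Management Shell, in the Exchange scripts directory.
# Collect for one hour, sampling once a minute, then write the reports.
# DAG1 and C:\Reports are placeholder values for illustration only.
.\CollectReplicationMetrics.ps1 -DagName DAG1 `
    -Duration "01:00:00" `
    -Frequency "00:01:00" `
    -ReportPath "C:\Reports"
```

The shell session must stay open for the full duration, so longer collections are usually started from a console session on the server itself.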
3.2. Public Folder Troubleshooting
Exchange 2010 initially seemed to take a step backward in public folder management with the EMC. Fortunately, Service Pack 1 brings back the ability to manage public folder settings. Because many companies still have public folders deployed, this makes administration easier for administrators who are not proficient with PowerShell and the *-PublicFolderClientPermission cmdlets.
Another addition in Service Pack 1 is a new Repair-PublicFolderDatabase cmdlet. This cmdlet can be used to detect and fix the following public folder corruptions:
Public folder replication state
Public folder view verification
Public folder physical corruption
Joe Cirillo
Senior Engineer, MCA:M, Horizons Consulting, Inc., USA
One of the most useful but overlooked items regarding the operation and support of Exchange 2010 is the set of preconfigured scripts that are made available following installation. These preconfigured scripts can be found at the following path:
<Install Drive>\Program Files\Microsoft\Exchange Server\V14\Scripts
There are 49 out-of-the-box scripts available to assist you with daily operational tasks, one-time configuration changes, and report generation. The following is an example of how these scripts can be used.
Many companies have
Public Folder infrastructures that have grown unabated over the years.
One of the most challenging aspects of managing Public Folders is
determining which Public Folders are no longer being accessed by users.
Armed with this information, an administrator can perform some needed
housecleaning. The AggregatePFData.ps1 script can assist with this
daunting task.
The AggregatePFData.ps1 script aggregates and captures information collected from the following cmdlets:
This script has been updated in Service Pack 1 to deliver real aggregate information collected from all replicas.
Then, the following information is aggregated at the public folder level:
Last user access and last user modification times
Owner of the public folder
Other properties such as MailEnabled, HasRules, ItemCount, FolderType, HasModerator, and TotalItemSize
After this report is generated, the administrator can begin the cleanup process. For Public Folders I deem unnecessary, I like to first simply hide them from users' view. Hiding a Public Folder for a period of time gives users a chance to call and report that their Public Folders are missing. After an allotted amount of time passes with no user complaints, you can back up the Public Folder database for archival and then safely remove the hidden, unused Public Folders from the database.
3.3. Client Access Server Troubleshooting
Of course, not everything
will always run as smoothly as planned. This section describes some
techniques to handle common issues or check on your Client Access Server's health.
3.3.1. Client Access Server Test Cmdlets
You can use a number of PowerShell cmdlets to test Client Access Server health. Table 8 lists the relevant PowerShell cmdlets with a description. Many of the cmdlets can target a specific user as a parameter. The New-TestCASConnectivityUser.ps1
script located in the scripts directory will create a test user that
you can use with these cmdlets.
Table 8. CAS PowerShell Test Cmdlets
CMDLET | DESCRIPTION |
---|
Test-MapiConnectivity | Tests the RPC Client Access service. Indirectly, this cmdlet also tests the directory service and mailbox store. |
Test-OwaConnectivity | Used to verify that OWA is running. It can be used to test all virtual directories or an individual virtual directory. It can also be used to test all mailboxes on servers in the same Active Directory site. It is recommended that you run Test-MapiConnectivity first to ensure that the mailbox is available. |
Test-OutlookWebServices | Verifies the service information returned by Autodiscover for the Availability Service, Outlook Anywhere, Offline Address Book, and Unified Messaging. |
Test-WebServicesConnectivity | Tests the functionality of Exchange Web Services by running GetFolder, CreateItem, DeleteItem, and SyncFolderItems operations over Outlook Anywhere. |
Test-EcpConnectivity | Verifies Exchange Control Panel Connectivity for all mailboxes on Exchange servers in the same site, or an individual ECP URL. |
Test-ActiveSyncConnectivity | Performs a full mailbox synchronization to verify the health of ActiveSync. |
Test-ImapConnectivity | Tests IMAP4 connectivity by creating and sending a special message to a mailbox. The cmdlet then logs on to the mailbox to check for the test message. If you use the LightMode parameter, only the logon is performed. |
Test-PopConnectivity | Tests POP3 connectivity by creating and sending a special message to a mailbox. The cmdlet then logs on to the mailbox to check for the test message. If you use the LightMode parameter, only the logon is performed. |
Test-PowerShellConnectivity | Used to test whether PowerShell remoting on the target Client Access server is healthy. |
Test-ServiceHealth | Tests to ensure that all Windows services required for the Client Access Server role are running. |
Test-SystemHealth | Runs a check of the Exchange environment against Microsoft best practices. To write to a file use the following two commands:
$temp=Test-SystemHealth -OutData
Set-Content -Value $temp.FileData -Path c:\temp\SystemHealthOutData.xml -Encoding Byte
You can then import this XML file into the Best Practices Analyzer tool found in the EMC Toolbox. |
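Several of the cmdlets in Table 8 can be combined into a quick health pass. The following is a minimal sketch, assuming a Client Access server named CAS01 and a mailbox user@contoso.com (both placeholders), and assuming the test user has already been created with New-TestCASConnectivityUser.ps1 for the protocol tests.

```powershell
# CAS01 and user@contoso.com are placeholder values for illustration only.
$cas = 'CAS01'

# List any server roles whose required Windows services are not all running.
Test-ServiceHealth -Server $cas |
    Where-Object { -not $_.RequiredServicesRunning } |
    Format-List Role, ServicesNotRunning

# Confirm that the mailbox can be reached before testing client protocols.
Test-MapiConnectivity -Identity 'user@contoso.com'

# Logon-only checks; these rely on the test user created by
# New-TestCASConnectivityUser.ps1 unless credentials are supplied.
Test-ImapConnectivity -ClientAccessServer $cas -LightMode
Test-PopConnectivity -ClientAccessServer $cas -LightMode
```

Running the service and MAPI checks first helps separate a stopped service or unavailable mailbox from a genuine protocol problem.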
3.3.2. Autodiscover
Autodiscover can be complex to configure correctly when you have a variety of clients coming in from both inside and outside the corporate network. Fortunately, Microsoft has given you a number of tools to help troubleshoot Autodiscover.
The first thing to check is a tool built into Microsoft Outlook. After launching Outlook, press and hold the Ctrl key and right-click the Outlook icon in the system tray. A hidden menu item will appear named Test Email AutoConfiguration. Simply type in an e-mail address and password and run the test. This will return all of the Autodiscover information that it discovers. From the protocol property, you can see the queries to find an SCP object or the queries via DNS, as well as the configuration returned for internal (RPC or EXCH) and external (HTTPS or EXPR) connectivity. If it cannot locate Autodiscover information, check the Log tab. This tab shows all of the different methods Autodiscover uses to establish a connection, which can help narrow down which methods are failing. From the client, make sure you can resolve the DNS name correctly by using a common tool such as nslookup.
Another method for troubleshooting Autodiscover is using Windows PowerShell from an Exchange server. The Test-OutlookWebServices cmdlet tests Autodiscover and also the service settings Autodiscover returns. Simply run the Test-OutlookWebServices cmdlet with the -Identity parameter set to a user's e-mail address; for example, Test-OutlookWebServices -Identity <user's e-mail address>.
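The server-side and DNS checks described above can be run together from the Exchange Management Shell. The following is a minimal sketch, assuming a mailbox user@contoso.com and an SMTP domain of contoso.com (both placeholders for illustration).

```powershell
# user@contoso.com and contoso.com are placeholder values.

# Test Autodiscover and the service settings it returns for one mailbox.
Test-OutlookWebServices -Identity 'user@contoso.com' | Format-List

# Show the Autodiscover URL that each Client Access server publishes
# to domain-joined clients through its SCP object.
Get-ClientAccessServer | Format-List Name, AutoDiscoverServiceInternalUri

# Check that the Autodiscover host name used by external clients resolves.
nslookup autodiscover.contoso.com
```

Comparing the published internal URI with the name that actually resolves in DNS often reveals a mismatch between the certificate, the URL, and the DNS records.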
3.3.3. Remote Connectivity Analyzer
The newest of the analyzers, the Remote Connectivity Analyzer, is not included in the EMC Toolbox. This Web-based tool is located at https://www.testexchangeconnectivity.com. Figure 8 shows the home page and the various tests an administrator can perform with this Web site.
Table 9 explains the various tests and how they can be used for troubleshooting.
Table 9. Remote Connectivity Analyzer tests
TEST TYPE | TEST DETAILS | USED FOR |
---|
Exchange ActiveSync | Test simulates a mobile device connecting using EAS. | Identify configuration or connectivity issues with ActiveSync. |
ActiveSync AutoDiscover | Test simulates an ActiveSync device obtaining its settings with AutoDiscover. | Identify configuration errors with AutoDiscover. |
Exchange Web Services General Test | Tests basic EWS tasks. | Useful for simulating EWS clients, such as Entourage, as well as for identifying configuration or connectivity issues with EWS. |
Exchange Web Services Service Account Access | Tests a service account's ability to access a specified mailbox and perform basic EWS tasks. | Primarily used by application developers to test the ability to access mailboxes with alternate credentials (Exchange impersonation). |
Outlook Anywhere | Tests the steps Outlook uses to connect with Outlook Anywhere. | Test Outlook Anywhere's configuration and connectivity. |
Outlook AutoDiscover | Tests the steps used by Outlook 2007 to obtain settings from AutoDiscover. | Identify configuration errors with AutoDiscover. |
Inbound SMTP Email | Tests inbound mail to a specified mailbox. This will check DNS MX record configuration and TCP Port 25 connectivity. | Validate your Exchange organization's ability to receive mail. |
Outbound SMTP Email | Checks outbound SMTP for connectivity and other issues, including Reverse DNS, Sender ID, and RBL checks. | Validate your Exchange organization's ability to send mail. This is useful when users are reporting that messages are not being delivered because they are marked as spam. |
The Remote Connectivity Analyzer provides step-by-step detail to pinpoint exactly where in the test any failures occurred. Of course, these tests are all run from the Internet, so they will not help troubleshoot issues that occur only inside the corporate network.
3.3.4. Certificates
Troubleshooting certificates can be challenging. Even though the certificate management MMC snap-in may import a certificate without reporting any issues, Exchange may fail to import the same certificate. This is because the MMC snap-in requires only basic hierarchy validation and does not perform additional checks. Exchange performs much more rigorous validation, including a certificate revocation check. You can check the certificate status with the Get-ExchangeCertificate cmdlet. If the import fails or there is an issue, the Status field will report Invalid. Another symptom is that services using SSL or TLS fail to start, resulting in the following error:
Note:
Description
The IMAP4 service failed to connect using SSL or TLS encryption. A
valid certificate is not configured to respond to SSL/TLS connections.
Check the configured hostname as well as which certificates are
installed in the Personal Certificates store of the computer.
One way to troubleshoot these types of errors is to export the certificate and run the certutil utility to verify the certificate. The syntax is:
Certutil -verify filename
Examine the output and see whether the error relates to the ability to check the certificate's revocation status. The Exchange
server might be unable to connect to the revocation point because of a
server proxy configuration error or Internet connection problem.
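The status check and the certutil verification can be run back to back. The following is a minimal sketch, assuming the certificate has already been exported to C:\temp\mail.cer (a placeholder path); the -urlfetch option asks certutil to retrieve the revocation and issuer URLs, which surfaces the connectivity problems described above.

```powershell
# List each installed Exchange certificate with its validation status.
Get-ExchangeCertificate |
    Format-List Thumbprint, Subject, Services, Status, NotAfter

# Verify the exported certificate, including retrieval of its
# CRL and AIA URLs. C:\temp\mail.cer is a placeholder path.
certutil -verify -urlfetch C:\temp\mail.cer
```

If the URL retrieval steps time out on the Exchange server but succeed from a workstation, the server's proxy or outbound Internet access is the likely cause.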
3.3.5. IIS Virtual Directory Troubleshooting
Occasionally an administrator may need to re-create the IIS virtual directories, such as OWA. Perhaps the configuration was changed inappropriately, or the Web site was damaged. In Exchange 2010 RTM, this was only achievable with PowerShell cmdlets such as New-OwaVirtualDirectory. In Service Pack 1, this ability was added to the EMC. When an administrator selects a Client Access Server in the EMC, there is a new action called Reset Virtual Directory. This action guides the administrator through a process that deletes and re-creates a virtual directory with its initial default settings.
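The PowerShell route for re-creating the OWA virtual directory remains available. The following is a minimal sketch, assuming a Client Access server named CAS01 and the Default Web Site (both placeholders), and assuming it is run in the Exchange Management Shell on the Client Access server itself. Record the current settings first, because the re-created directory returns to defaults.

```powershell
# 'CAS01\owa (Default Web Site)' is a placeholder identity.
$vdir = 'CAS01\owa (Default Web Site)'

# Record the current URLs and authentication settings for later reference.
Get-OwaVirtualDirectory -Identity $vdir |
    Format-List Name, InternalUrl, ExternalUrl, *Authentication*

# Remove the damaged virtual directory (prompts for confirmation).
Remove-OwaVirtualDirectory -Identity $vdir

# Re-create it with default settings on the local Client Access server.
New-OwaVirtualDirectory -WebSiteName 'Default Web Site'
```

After the directory is re-created, reapply any non-default URLs or authentication settings noted in the first step.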